# ImageNet-Pretrained Models

| Model | Author | License | Tags | Downloads | Likes | Description |
| --- | --- | --- | --- | --- | --- | --- |
| MAR VAE KL16 | xwen99 | MIT | Image Generation | 81 | 0 | A KL-16 variational autoencoder (VAE) trained on ImageNet-1k for image-to-image tasks. |
| PVT Tiny 224 | Xrenya | Apache-2.0 | Image Classification, Transformers | 25 | 0 | Pyramid Vision Transformer (PVT), a transformer-based vision model designed for image classification. |
| EfficientNet B6 | google | Apache-2.0 | Image Classification, Transformers | 167 | 0 | A mobile-friendly, purely convolutional model that uniformly scales depth, width, and resolution via a compound coefficient; trained on ImageNet-1k. |
| EfficientNet B1 | google | Apache-2.0 | Image Classification, Transformers | 1,868 | 1 | A mobile-friendly, purely convolutional network that scales efficiently by uniformly adjusting depth, width, and resolution via a compound coefficient. |
| MobileNet V2 1.4 224 | Matthijs | Other | Image Classification, Transformers | 26 | 0 | A lightweight convolutional network designed for mobile devices and well suited to image classification. |
| MobileNet V2 1.0 224 | Matthijs | Other | Image Classification, Transformers | 29 | 0 | A lightweight convolutional network designed for mobile devices and well suited to image classification. |
| MobileNet V1 1.0 224 | Matthijs | Other | Image Classification, Transformers | 41 | 0 | A lightweight convolutional network for mobile and embedded vision applications, pretrained on ImageNet-1k. |
| LeViT 128S | facebook | Apache-2.0 | Image Classification, Transformers | 3,198 | 4 | A vision transformer pretrained on ImageNet-1k that borrows convolutional design ideas for faster inference. |
| LeViT 128 | facebook | Apache-2.0 | Image Classification, Transformers | 44 | 0 | A vision-transformer image classifier that achieves efficient inference by borrowing convolutional design ideas. |
| LeViT 256 | facebook | Apache-2.0 | Image Classification, Transformers | 37 | 0 | An efficient transformer-based vision model built for fast inference, pretrained on ImageNet-1k. |
| LeViT 384 | facebook | Apache-2.0 | Image Classification, Transformers | 37 | 0 | A vision transformer pretrained on ImageNet-1k that borrows convolutional design ideas for faster inference. |
| CvT 21 384 | microsoft | Apache-2.0 | Image Classification, Transformers | 29 | 1 | A Convolutional Vision Transformer (CvT) image classifier pretrained on ImageNet-1k at 384x384 resolution. |
| CvT 13 384 | microsoft | Apache-2.0 | Image Classification, Transformers | 27 | 0 | A Convolutional Vision Transformer pretrained on ImageNet-1k that improves on plain vision transformers by introducing convolutional operations. |
| RegNet Y 008 | facebook | Apache-2.0 | Image Classification, Transformers | 22 | 0 | A RegNet model trained on ImageNet-1k; an efficient vision architecture designed through neural architecture search. |
| RegNet X 320 | facebook | Apache-2.0 | Image Classification, Transformers | 31 | 0 | A RegNet model trained on ImageNet-1k; an efficient vision architecture designed through neural architecture search. |
| ResNet 101 | microsoft | Apache-2.0 | Image Classification, Transformers | 4,659 | 17 | A deep residual network pretrained on ImageNet-1k, using the improved v1.5 architecture. |
| ResNet 34 | microsoft | Apache-2.0 | Image Classification, Transformers | 4,355 | 9 | A residual-learning convolutional network for image classification, pretrained on ImageNet-1k. |
| Swin Small Patch4 Window7 224 | microsoft | Apache-2.0 | Image Classification, Transformers | 2,028 | 1 | A hierarchical, window-based vision transformer for image classification whose computational complexity is linear in input image size. |
| Swin Tiny Patch4 Window7 224 | microsoft | Apache-2.0 | Image Classification, Transformers | 98.00k | 42 | A hierarchical vision transformer that achieves linear computational complexity by computing self-attention within local windows. |
| DeiT Tiny Distilled Patch16 224 | facebook | Apache-2.0 | Image Classification, Transformers | 6,016 | 6 | A distilled Data-efficient image Transformer (DeiT), pretrained and fine-tuned on ImageNet-1k at 224x224, learning from a teacher model through distillation. |
| Swin Large Patch4 Window7 224 | microsoft | Apache-2.0 | Image Classification, Transformers | 2,079 | 1 | A hierarchical vision transformer with windowed self-attention and linear computational complexity, suited to image classification and dense recognition tasks. |
| DeiT Base Patch16 224 | facebook | Apache-2.0 | Image Classification, Transformers | 152.63k | 13 | A Data-efficient image Transformer (DeiT), pretrained and fine-tuned on ImageNet-1k at 224x224 resolution. |
| DeiT Tiny Patch16 224 | facebook | Apache-2.0 | Image Classification, Transformers | 29.04k | 9 | An efficiently trained vision transformer, pretrained and fine-tuned on ImageNet-1k for image classification. |
| Swin Base Patch4 Window12 384 | microsoft | Apache-2.0 | Image Classification, Transformers | 1,421 | 4 | A hierarchical vision transformer based on shifted windows, designed for image classification, with computational complexity linear in input image size. |
| ViT Base Patch16 224 In21k | google | Apache-2.0 | Image Classification | 2.2M | 323 | A Vision Transformer pretrained on ImageNet-21k for image classification. |
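
Most of the classification checkpoints above expose the standard Hugging Face `transformers` image-classification interface, so a single loading pattern covers the ResNet, Swin, DeiT, LeViT, CvT, RegNet, EfficientNet, MobileNet, PVT, and ViT entries alike. Below is a minimal sketch, assuming the `author/model` pairs in the table resolve as Hub model IDs (e.g. `microsoft/resnet-101`) and that `transformers`, `torch`, and `Pillow` are installed; `cat.jpg` is a hypothetical local image.

```python
# Minimal sketch: run one of the ImageNet-pretrained classifiers listed above.
# Assumptions: "microsoft/resnet-101" resolves on the Hugging Face Hub, and
# "cat.jpg" is a local image file (both are placeholders here).
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForImageClassification

model_id = "microsoft/resnet-101"
processor = AutoImageProcessor.from_pretrained(model_id)  # resize/normalize config
model = AutoModelForImageClassification.from_pretrained(model_id)
model.eval()

image = Image.open("cat.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # one score per ImageNet-1k class

pred = logits.argmax(-1).item()
print(model.config.id2label[pred])  # human-readable class name
```

Swapping `model_id` for any other classification entry in the table (e.g. `facebook/deit-base-patch16-224` or `microsoft/swin-tiny-patch4-window7-224`) leaves the rest of the code unchanged. The MAR VAE entry is the exception: it is a generative autoencoder rather than a classifier, so it is not loaded through `AutoModelForImageClassification`.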